How to measure CMB polarization power spectra without losing information 
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We present a method for measuring CMB polarization power spectra given incomplete sky coverage 
and test it with simulated examples such as Boomerang 2001 and MAP. By augmenting the quadratic 
estimator method with an additional step, we find that the E and B power spectra can be effectively 
disentangled on angular scales substantially smaller than the width of the sky patch in the narrowest 
direction. We find that the basic quadratic and maximum-likelihood methods display a unneccesary 
sensitivity to systematic errors when T — E cross-correlation is involved, and show how this problem 
can be eliminated at negligible cost in increased error bars. We also test numerically the widely 
used approximation that sample variance scales inversely with sky coverage, and find it to be an 
excellent approximation on scales substantially smaller than the sky patch. 
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I. INTRODUCTION 

As experimental groups roar ahead to map the CMB 
intensity with increasing resolution and sensitivity, a sec- 
ond parallel front is being opened up: CMB polariza- 
tion. Theoretical advances now allow model predictions 
for CMB polarization to be computed with exquisite ac- 
curacy and it is known that polarization measure- 
ments can substantially improve the accuracy with which 
parameters can be measured over the no-polarization 
case by breaking the degeneracy between certain param- 
eter combinations . CMB polarization can also pro- 
vide crucial cross-checks and tests of the underlying the- 
ory. After placing ever tighter upper limits a num- 
ber of experimental teams are likely to make the first 
detections of CMB polarization in the next year or two, 
about a decade after unpolarized CMB fluctuations were 
first detected ||. It is therefore timely to address real- 
world issues related to extracting polarized power spectra 
from experiments with incomplete sky coverage and com- 
plicated noise. This is the purpose of the present paper, 
with emphasis on the problem of separating the two po- 
larization signals known as E and B jj],0|. Important 
steps in this direction were first taken in |l^,|ll| , for the 
special case of experiments mapping circles in the sky. 

The E/B problem goes back to the fact that the po- 
larization tensor field on the sky can be separated into 
a curl-free and a divergence free component, and is most 
naturally expressed in terms of two scalar fields, denoted 
E and B by analogy with electromagnetism Not 
only does this separation eliminate the coordinate system 
dependence that plagues the familiar Stokes parameters, 
but E and B also probe distinct physical effects, making 
them the natural meeting point for theory and observa- 
tion. Unfortunately, the correspondence between E and 
B and the measured Stokes parameters is not spatially lo- 
cal, involving a partial differential equation, which means 
that it is not possible to uniquely recover E and B from 
a map with merely partial sky coverage. This issue is 
particularly important since the -B-signal from inflation- 
produced gravity waves potentially offers a unique probe 



of ultra-high-energy physics, but may be swamped by a 
leakage from a larger i^-signal Jl^,[l3| ■ 

A key goal of CMB analysis is to constrain cosmological 
models, and information-theoretical methods have been 
frequently employed in the literature to study how accu- 
rately this can be done in principle with a given data set, 
using the Fisher information matrix formalism JT^ , p^| . 
For CMB polarization experiments, this has been useful 
both for optimizing experimental design jl6|,[l7| and for 
accuracy forecasting in general Q. These information- 
theoretical tools are equally useful for data analysis, since 
they provide a simple way of checking whether cosmologi- 
cal information is being lost in the data analysis pipeline. 
Each step in such a pipeline typically compresses the in- 
put data into a smaller set of numbers, and if the output 
can be shown to retain all the cosmological information 
of the input, the method is said to be lossless. Lossless 
methods have been developed and extensively tested for 
the unpolarized case, for both mapmaking jl8],[l9| and 
power spectrum estimation |2(],|2l]]. As we will see, it is 
possible to draw heavily on these methods for the po- 
larized case as well, although a number of adaptations 
make them simpler to interpret and help improve the 
.E/-B-separation and robustness towards systematic er- 
rors. 

The rest of this paper is organized as follows. In Sec- 
tion U, we discuss basic methods involved in polarization 
data analysis. To keep the presentation from becoming 
too abstract, many method details and extensions are 
postponed to Section III , where they are illustrated with 
plots from actual applications. This section also assesses 
the effectiveness of the various methods numerically, with 
applications to five experimental examples. A step-by- 
step summary of how to compute the signal covariance 
matrix is given in Appendix A for the reader wishing to 
do so in practice. Our conclusions are summarized in 



Section IV 
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II. METHOD 

In this section, we first establish some basic nota- 
tion, then discuss the extraction of maps from raw time- 
ordered data, and finally cover the extraction of power 
spectra from maps. 



A. Notation 

The linear polarization pattern of the CMB sky is char- 
acterized by the two Stokes parameters Q and U in each 
sky direction. Q and U are the components of a rank 
2 tensor (spinor), loosely speaking a vector without an 
arrow on it, so Q and U maps are defined only relative 
to a convention providing a reference direction at each 
point in the sky. Let us discretize the T, Q and U maps 
into N pixels centered at unit vectors ?i, rjv, and write 
Ti = T(r t ), Q t = QfTi), Ui = U(v % ) (we let T denote the 
unpolarized intensity, often called I). We group these 
numbers into three A-dimensional vectors T, Q and U 
and group these in turn into a single 3-ZV-dimensional vec- 
tor 




(1) 



The statistical properties of x have been computed in full 
detail in the literature , and are characterized by six 
separate power spectra: Cj for the unpolarized signal 
T, Cf for the ^-polarization, Cf for the B-polarization, 
and Cj E , Cf B and Cf B for the three possible cross- 
correlations. The power spectra Cj B and Cf B are both 
predicted to vanish for the CMB, but it will be inter- 
esting to measure them nonetheless, as probes of polar- 
ized foregrounds and exotic parity- violating physics |pl| . 
We will occasionally refer to the three cross-correlations 
{TE, TB, EB) as (X, Y, Z), respectively. We will find it 
useful for data analysis purposes to recast the polariza- 
tion problem in exactly the same mathematical form as 
the simpler unpolarized case, encoding all complications 
in a set of matrices. The vector x has a vanishing expec- 
tation value ((x) = 0), and we can write its covariance 
matrix as 



C = 



(2) 



for a set of parameters pi and known matrices Pi. If the 
six observed power spectra are negligibly small for all 
multipoles £ above some value £ max (which is always the 
case because of the smoothing caused by the finite an- 
gular resolution of an experiment), then we define these 
parameters to be 
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(3) 



(4) 



is the familiar rescaled power that is normally used 
in power spectrum plots in place of C[ . The index 
P denotes any of the six power spectrum types, i.e., 
P = T,E, B, TE, TB, EB. The parameter r] is the nor- 
malization of the detector noise in the maps relative to 
the predicted value, and is normally equal to unity. Since 
C depends linearly on the parameters p t , we can define 
the P-matrices as 



dC 

dpi' 



(5) 



In other words, the first 6(£ max — 1) P-matrices give the 
contributions from the T, E, B, TE, TB and EB power 
spectra, and the last one is simply the noise covariance 
matrix, giving the contribution from experimental noise. 

Appendix A summarizes how to compute the C- 
matrix, and is intended for the reader who wishes to 
write software to do this in practice. The P-matrix cor- 
responding to ST 2 ^ is obtained from these formulas by 
simply setting all power spectra to zero, with the single 
exception <5T 2 f = 1, i.e., C'[ = 2n/£(£ + 1). These P- 
matrices are therefore independent of the actual power 
spectra, and depend merely on the relative orientations 
of the map pixels. 



B. Background: from timestream to T, Q &: U maps 

To place our problem in context, this section briefly 
reviews how to reduce experimental data to maps in the 
form of equation ([!]). Although it is widely known how 
to do this, we will see that there are some subtle issues 
related to unmeasured modes. 



1. The basic inversion 

Suppose we have observed the sky a large number of 
times with (perhaps) polarized detectors in a variety of 
different orientations. Let yi denote the number mea- 
sured in the i th observation, and group this time-ordered 
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data set (TOD) into an M-dimensional vector y. The ob- 
served temperature fluctuation yi seen through a linear 
polarizer takes the form 



1 



[T kl + Qk, cos(2a;) + U ki sin(2 ai )] + m, (6) 



where cti gives the clockwise angle between the polar- 
izer and the reference direction of the coordinate system, 
rii denotes the detector noise, and fcj is the number of 
the pixel pointed to during the i th observation. If the 
experiment instead measures the difference between two 
perpendicular polarizations, yi takes the simpler form 



Vi 



}ki cos(2o!i) + sin(2ai) + rn, 



(7) 



An unpolarized experiment is described by yi — T ki + n%. 
Grouping the numbers into vectors, we can write any of 
these three expressions as a simple matrix equation 



y = Ax + n, 



(8) 



where the matrix A encompasses all the relevant details 
of the observations. For a pure polarization experiment 
as in equation (Q), A will contain only zeroes except for a 
single sine and cosine entry in each row, in columns cor- 
responding to the pixel observed. More general observ- 
ing strategies such as the beam-differencing of the MAP 
Satellite or modulated beams clearly retain the simple 
form of equation (Q) , merely with a slightly more compli- 
cated (but known) A-matrix. For a well-designed exper- 
iment such as MAP, the system of equations (^) is highly 
overdetermined, and the estimate x of the map triplet x 
given by the familiar equation E3] 



x = Wy, W = [A t MA] _1 A*M 



(9) 



is unbiased ((x) = WAx + W(n) = x since WA = I). 
If M a reasonable approximation to N _1 , then the map 
noise Wn will have minimum variance to first order, with 
covariance matrix S = WNW* « [A t N~ 1 A]~ 1 . Both 
the map triplet x and its exact noise covariance matrix X 
can be computed in ~ N 3 time, and even faster in many 
important cases p^| , p2] -^o|. Note that the last P-matrix 
is this noise covariance matrix, i.e., Pe^™— s = S. 



2. The problem of missing modes 

In many cases, the inversion in equation (^) fails be- 
cause the matrix to be inverted is singular. Although 
there are typically much more measurements yi than un- 
knowns Xi (M 3> 3iV), symmetries or other properties 
of the observing strategy often imply that A*x = for 
certain vectors x, i.e., that A is singular. A ubiquitous 
example is lack of sensitivity to the mean (monopole) 
in the map. Experiments measuring linear polarization 
without cross-linking may be sensitive to only a certain 
linear combination of Q and U, unable to recover the two 



separately. As another example, the PIQUE experiment 
measures only sums of Q-values 90° apart on a circle 
in the sky, thereby losing information about modes tak- 
ing values (+1,-1, +1,-1) at four corners of a square. 

All such problems can in principle be dealt with by reg- 
ularizing the inversion (see, e.g., the Appendix of pof ), 
which sets the unmeasurable modes to zero in the final 
maps, and keeping track of which modes are missing dur- 
ing subsequent analysis of x. In practice, however, it is of- 
ten more convenient to eliminate this extra bookkeeping 
requirement by encoding the corresponding information 
in the noise covariance matrix S as near-infinite noise 
for the missing modes. This ensures that the missing 
modes are given essentially zero weight in any subsequent 
analysis. This "deconvolution" technique is described in 
detail and tested numerically in |31[ , and in practice cor- 
responds to adding a matrix <r nfto the [A'MA]-term 
of equation (||) before the inversion, where a is about 10 2 
times the rms cosmological signal. In summary, it allows 
any observed data set to be put in the standard form of 
equation ([!]), described fully by the pair (x, S) regardless 
of any missing modes. 



C. Measuring the power spectra with quadratic 
estimators 

In this section, we discuss how to measure the six power 
spectra Cj, Cf , Cf , Cj E , Cj B and Cf B from the map 
triplet x of equation ([!]) . There are two basic approaches 
to this problem that ultimately give the same answer. 

The first approach is to start by deconvolving the Q 
and U maps into E and B maps, and then use these as 
inputs to the power spectrum estimation. Since Q and 
U depend linearly (but non-locally) on E and B, this 
can always be done with the deconvolution method of 
pl[ . Incomplete sky coverage will simply be reflected by 
near-infinite noise in certain modes in the resulting noise 
covariance matrix. An advantage of taking this route 
is that Wiener-filtered E and B maps can be plotted, 
whose spatial information may provide useful diagnostics 
for foreground contamination and systematic errors. 

The second approach, which we will adopt here, is to 
skip the intermediate step of E and B maps and measure 
the power spectra directly from x. 



1. The definition of a quadratic estimator 

Our basic problem is to estimate the parameters Pi 
in equation (g) from the observed data set x (we drop 
the tilde from Section II B for simplicity). Fortunately, 
this problem is mathematically identical to that for the 
unpolarized case, which has already been solved using so- 
called quadratic estimators . This class of methods 
is closely related to the maximum-likelihood method — 
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we return to this issue in Section [V. A quadratic esti- 
mator qi is simply a quadratic function of the data vector 
x, so the most general case can be written as 



qi = x'QiX = tr [QjXX*] 



(10) 



where Qi an arbitrary symmetric 3N x 3./V-dimensional 
matrix. We will often find it convenient to group the 
parameters pi and the estimators qi into iVb-dimensional 
vectors, denoted p and q, where Nf, = 6^ max — 5 is the 
number of bands. The matrices Qi should not be con- 
fused with the vector of stokes parameters Q from equa- 
tion (§! 



2. The window function of a quadratic estimator 

Since Qi can be any symmetric matrix, one can 
write down infinitely many different quadratic estima- 
tors. Whether a given choice is useful or not depends on 
the mean and covariance of the vector q. Equations (ITT 
and (Q) show that the mean of q is 



= tr [QiC] = J2 tr [Qi p i'fc' 

i' 



(11) 

(12) 



p'=i t'=i 



where i = {£ max — l){P—l)-\-l—l is the parameter number 
corresponding to polarization type P and multipole i, 
b = tr [QiS] is the contribution from experimental noise, 
and 



W" P P , 



tr [Q l P l 



(13) 



These quantities can be viewed as a generalized form of 
window functions, since for a fixed (P,£), they show the 
expected contributions to qi not only from different l- 
values, but also from different polarization types. 

Ideally, we would be able to estimate Cf by apply- 
ing a quadratic estimator with the perfect window func- 
tion WgFpi — Spp'Su', but this is often impossible or 
undesirable with incomplete sky coverage, shifting the 
aim to making the window functions narrow in both the 
^-direction and the P-direction. Minimizing such un- 
wanted mixing of different polarization types is one of 
the key topics of this paper, and numerous examples of 



such window functions will be plotted in Section III 
The covariance matrix of q is 



(qq t )-(q)<q) t = 2tr[Q,CQ J C] 



(14) 



for the case where x is Gaussian, and it is clearly desirable 
to make it small in some sense. 



3. Quadratic estimators: specific examples 

It can be shown [^0| that the quadratic estimator de- 
fined by 



Qi = jM^(B)i J -C- 1 P i C- 1 , 



(15) 



distills all the cosmological information from x into the 
(normally much shorter) vector q if C is the true covari- 
ance matrix. Moreover, if C is a reasonable estimate of 
the true covariance matrix, say by computing it as in Ap- 
pendix A using a prior power spectrum consistent with 
the actual measurements, then the data compression step 
of going from x to q de stroys information only to second 
order. In equation (|15|), B is an arbitrary invertible ma- 
trix, and the normalization constants Mi are chosen so 
that all window functions sum to unity: 



N h 



J2 tr [Q< 



(16) 



This means that we can interpret qi as measuring a 
weighted average of our unknown parameters, the win- 
dow giving the weights. We will discuss a number of 



choices of B in Section III that have various desirable 
properties. B = I gives minimal but correlated error 
bars. B = F _1 gives beautiful Kronecker-delta window 
functions, corresponding to (q) = p at the price of anti- 
correlated and typically very large error bars, where 



1 



-tr 



c -i«C c -iflC 

dpi dpj 



(17) 



is the so-called Fisher information matrix |bj,|l5j for the 
case where the CMB fluctuations are Gaussian. The in- 
termediate choice B = F -1 / 2 is normally a useful com- 
promise |3^ 1, giving uncorrelated error bars and narrow 
window function with width A£ of order the inverse map 
size. We will describe and test additional choices of B in 



Section III C and Section [II D 
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4- Broadening the bands 



For CMB maps of small size where the window func- 
tion width A£ ^ 1, it is unnecessary to oversample the 
measured power with a separate parameter pi at each 
multipole I. In such cases, it is useful to parametrize 
the power spectrum as a staircase-shaped (piecewise con- 
stant) function, with the parameters pi giving the heights 
of the various steps pi|. We will occasionally do this in 



Section III 



III. RESULTS 

In this section, we will apply the quadratic estimator 
method to a variety of fictitious data sets to quantify 
how experimental attributes such as sky coverage and 
sensitivity affect the ability to measure and separate the 
different power spectra. We will also describe two ways 
in which the basic quadratic estimator technique can in 
some circumstances be improved for polarization appli- 
cations. 



O 0.02 



a 

• 0.04 



B 



B 



20 40 60 80 100 120 

Multipole 1 

FIG. 1. The window function corresponding to the B2001 mea- 
surement of 8T 2 1 for £ = 20. Upper panel shows sensitivity to 
E-power (wanted) and lower panel shows sensitivity to B-power 
(unwanted — what we call "leakage"). 
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FIG. 2. Same as Figure [l], but for the B2001 i?-measurement 
aimed at i = 70. 



A. Case studies 

Our five case studies are listed in Table 1. The first 
three cover the northern Galactic cap b > 20° with suc- 
cessively higher sensitivity. The fourth covers a much 
smaller cap b > 80°, and the fifth covers merely a one- 
dimensional region: the circle defined by b — 80°. These 
case studies are not intended to be accurate forecasts 
for the actual performance of the experiments listed, but 
rather to span an interesting range in sensitivity, sky cov- 
erage and map shape. There are therefore numerous de- 
partures from realism. For instance, the actual maps 
from COBE, MAP and Planck will of course include the 
southern Galactic caps as well — apart from reducing 
the error bars, adding this reflection symmetry to the 
sky maps eliminates all leakage between even and odd 
values p0| , preserving the overall width of the window 
functions that we will present but giving them a jagged 
behavior where every other entry vanishes. The actual 
map from the Boomerang 2001 ("B2001") experiment 
will not be round and will have non-uniform sensitivity. 
We assume uncorrelated pixel noise for simplicity. Most 
of the experimental sensitivities we have used are likely 
to be slight underestimates, being based on a single fre- 
quency channel. 

In all cases, we explicitly perform the various matrix 
computations described in Section [n| The reason that 
this is numerically feasible within the scope of this paper 
is that the large angular scales of interest here allow us to 
use larger and fewer pixels than the experimental teams 
will employ in their actual data reduction. 

We will first study window functions to quantify how 
accurately E and B can be separated in various cases. We 
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will then discuss measurement of the cross power spec- 
trum and finally investigate how accurately approximate 
error bars from the Fisher matrix formalism match the 
results from our full numerical calculation. 



B. E and B window functions 

We begin by quantifying the ability of B2001 and 
MAP to separate E and B using Q- and [/-maps. We 
pixelize our sky patches using the equal-area icosahedron 
method at resolution levels 35 and 7, respectively, 
corresponding to 361 B2001 pixels and 561 MAP pixelsf]. 
Since Q and U are measured for each pixel, the data 
vectors x have twice these lengths. We use the method 
given by equation (|l5|) with B = F -1 / 2 unless other- 
wise specified. We compute fiducial power spectra CT , 



Cf and Cj E , with the CMBFAST software fj! 



using 



cosmological parameters from the "concordance" model 
from J36| , which provides a good fit to existing CMB and 
large scale structure data. We set Cj B — Cf B = 0. 
Although the true B-power spectrum may be close to 
zero, we set C B — C B in our fiducial model since we 
wish to highlight geometrical effects. Since this prior is 
-EyS-symmetric, any asymmetries between E and B in 
our resulting window functions and error bars will be due 
to geometry alone. We eliminate sensitivity to offsets by 
projecting out the mean (monopole) for the T, Q and U 
maps separately, as described in the Appendix of ]2Q]. 
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FIG. 4. Same as Figure [l], but for the MAP _B-measurement 
aimed at i = 5. 
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FIG. 5. Same as Figure [I| but for the MAP measurement of 
ST 2 f for I = 5. 



Multipole 1 

FIG. 3. Same as Figure but for the MAP ^-measurement 
aimed at £ = 2. 



* We use the icosahedron pixelization since it has the round- 
est (mainly hexagonal) pixels and is highly uniform. Although 
we did not use it here, the HEALPIX package [[mJ offers a use- 
ful alternative, allowing azimuthal symmetry to be exploited 
for saving computer time. 



1. Dependence on sky coverage and angular scale 

Figures show a sequence of sample window func- 
tions for B2001 and MAP. Note that it is possible for 
window functions to go slightly negative for the decorre- 
lation method B = F -1 / 2 used here, whereas the method 
given by B = I guarantees non-negative windows. Just 
as for the unpolarized case, the window function width 
A£ is seen to be fairly independent of the target multi- 
pole £, essentially scaling as the inverse size of the sky 
patch covered 
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The amount of leakage of i?-power into our E estimate 
is quantified by the lower panels in Figures and is 
seen to decrease as smaller scales are probed. Figure || 
targets B-polarization and looks like Figure [| with the 
two panels swapped, thereby showing that the leakage 
problems between E and B are quite symmetric. The 
smaller the area under the unwanted half of the window 
function, the better our method separates E and B. As 
a simple quantitative measure of this power leakage, let 
us therefore define a 2 x 2 leakage matrix 1/ for each £, 
given by 



j pp , 



(18) 



'=2 



where P and P' take the values E, B. In other words, 
the four components of this leakage matrix are the ar- 
eas under the four histograms in Figure ^ and 0. If 

L EE = L BB = 1 and L EB = L BE = °! Le -i if L = ^ 

then there is on average no leakage at all between E and 
B. For the simple case of complete sky coverage and 
uniform noise, all window functions become Kronecker 



delta functions, Wf, p P , — Spp'Sw, and we verified that 
this happens numerically as a test of our software (in 
practice, it works only when £, £' are smaller than the 
scale corresponding to the pixel separation, i.e., when 
the map is adequately oversampled) . This simple case 
thus gi ves th e ideal case L = I, but we will return in 
Section [II C to a method producing this desirable result 
even for partial sky coverage. 



and angular scale, we plot the leakage for B2001, MAP 
and CIRCLE as a function of I in Figure [|. Specifically, 
we plot the ratios of unwanted to wanted contributions, 
i.e., L EB /Li EE and L BE /Ti BB . These plots show three 
noteworthy results: 



1. The situation for E and B is rather symmetric, 
with essentially equal leakage from B to E as vice 
versa. 

2. The leakage drops with £. 

3. The B2001 and MAP curves have roughly similar 
shape apart from a scaling of the horizontal axis by 
a factor ~ 7, corresponding to the map size ratio. 

Result 2 is expected since map boundary effects (incom- 
plete sky coverage) are the reason that we cannot sepa- 
rate E and B perfectly — these boundary effects become 
less important as angular scales much smaller than the 
map are considered. In the small-scale limit where sky 
curvature and discreteness of £ become irrelevant, one 
would expect result 3 as well, since there is no other £- 
scale in the problem than the window function width A£, 
of order the inverse size of the map. If 6 denotes the di- 
ameter of our circular sky patches in radians, then the 
FWHM window widths for B2001 and MAP are roughly 
fit by A£ sw 5/9, and the figures show that the leakage 
ratio drops below 15% for £ > 2A£. (Things are differen t 
for the CIRCLE case, which we defer to Section |EII B 2| .) 

In conclusion, we have found that E/B separation 
works well for £ 3> A£. 
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FIG. 6. The amount of leakage of S-power into the _B-power 
spectrum estimates is shown for the cases of B2001 (top panel, 
solid), CIRCLE (top panel, dashed) and MAP (bottom panel, 
solid). These curves show the B/E ratio Q'ebI Ee) ^ or ^ e 
^-estimates. The corresponding curves for leakage in the reverse 
direction (the _E/_B-ratio Lg B /Lg B for the B-estimates) are visu- 
ally identical. 
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FIG. 7. Same as Figure [l| but for the CIRCLE _B-measurement 
aimed at I = 20. 



To assess how the leakage depends on sky coverage 
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3. Dependence on sensitivity 




20 40 60 80 100 120 



Multipole 1 

FIG. 8. Same as Figure but for the CIRCLE ^-measurement 
aimed at I = 70. 



2. Dependence on map shape 



Above we investigated how E / B -leakage was affected 
by map size and shape. To assess the effect of map sen- 
sitivity, we compare our COBE, MAP and Planck exam- 
ples. These have identical sky coverage and pixelization, 
so the only difference is the sensitivity per unit area which 
increases dramatically from COBE to MAP to Planck. 
We refrain from plotting the three leakage curves, since 
they look visually identical to the MAP curve in Figure ||. 
This means that the effect of sensitivity on E/B separa- 
tion is negligible compared to the effect of sky coverage. 
In other words, it depends mainly on geometry and only 
weakly on the (sensitivity-dependent) details of the pixel 
weighting. 

Generally, the quadratic estimator method strives to 
minimize error bars by reducing leakage from multipolcs 
and polarization types with substantial power. In a sit- 
uation where sample variance is dominant, this tends to 
make windows slightly lopsided, with a wider wing to- 
wards the direction where power decreases — in most 
cases towards the right. Conversely, in a situation where 
detector noise is dominant, windows tend to be slightly 
lopsided in the opposite sense, since noise power normally 
increases on smaller scales. 



Above we studied how leakage problems depend on 
map size. To assess how they depend on map shape, 
we will now compare two rather extreme examples: a 
disc and a circle. The B2001 and CIRCLE maps have 
the same diameter and thus probe comparable angu- 
lar scales. However, whereas the B2001 map is truly 
two-dimensional, the CIRCLE map is essentially one- 
dimensional, containing merely a single strip of pixels 
along the circumference. The two current polarization 
experiments POLAR Q and PIQUE both use ring- 
shaped maps, and this important case has also been ex- 
tensively studied theoretically jljj ]. 

We pixelize our CIRCLE map with 360 equispaced 
pixels around the circle. Sample CIRCLE window func- 
tions arc shown in Figures 0and|8[ It is seen that the 
CIRCLE windows are generally broader (with larger A£) 
than their B2001 counterpart, which agrees with the well- 
known rule of thumb [p3| that the narrowest dimension of 
a map is the limiting factor. We also see that whereas the 
B2001 leakage reduced on smaller scales, things do not 
get correspondingly better for the CIRCLE case. This 
explains the dashed curve in Figure ||, which shows that 
subs tantia l leakage persists even on small scales. In Sec- 
tion III C , we will describe how leakage can be further 



reduced. 

In conclusion, we find that although ring maps do 
allow an interesting degree of E'/iJ-separation, a two- 
dimensional map works better in this regard. 



C. Disentangling E and B better 

Above we have seen that the E and B power spectra 
can be fairly accurately separated on angular scales t 3> 
A£ with the decorrelated quadratic estimator method. 
Here we will argue that it is in some cases possible to do 
even better. Our motivation for pursuing this is that 
whereas broad windows in the ^-direction are easy to 
interpret (corresponding simply to a smoothing of the 
power spectrum), the mixing of different polarization 
types is rather annoying and complicates the interpre- 
tation of results. 

Each choice of B in equation ( |l5| ) corresponds to a 
different way of plotting the results. Which is the best 
choice? 

If the goal is to use the measured band power estimates 
in p to constrain cosmological parameters, the choice is 
irrelevant. Any two methods using invertible B-matrices 
of course retain exactly the same cosmological informa- 
tion, since it is possible to go back and forth between the 
corresponding two p- vectors by multiplying by B2B7 1 or 
BiB^ 1 . This means that the likelihood function for cos- 
mological parameters will be identical for the two meth- 
ods. 

The choice of B is therefore mainly a matter of pre- 
senting the power spectrum measurements in a clear and 
intuitive way. Ideally, a method would have the following 
properties 



1. No leakage between E and B 
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2. Narrow window functions 



3. Uncorrelated error bars 

Unfortunately, these three properties are only achievable 
simultaneously (without information loss) for the case of 
complete sky coverage. As discussed in |3^], the choice 
B = F -1 / 2 in equation ([[5]) achieves 3 and does a fairly 
good job on 2, giving window functions with the funda- 
mental ^-resolution corresponding to the sky coverage. 
Above we saw that this gives a leakage between E and B 
that is substantial on scales corresponding to the size of 
the survey and approaches zero on substantially smaller 
scales. 

In contrast, the choice B = F _1 in equation ( 113 ) 
achieves 1 and 2 perfectly, but at the cost of produc- 
ing measurements that are difficult to interpret because 
the error bars are typically anticorrelated and huge |32| . 
However, these problems are to a large extent caused by 
eliminating the rather benign leakage between different 
^-values. We will now describe a less aggressive method 
that merely targets the E/B- leakage. Specifically, let us 
insist that the leakage matrices 1/ = I for all I. 

Let us define the disentangling method as the one given 
by equation (^) with the B-matrix chosen as 

Bjj/ = Bpfp^/ = ^(L f 1 )pp»Fp, 1 ,^p, £ ,. (19) 
p" 

Here we have once again combined P and £ into a single 
index i = (£ max — 1)(P — 1) + 1 — 1. It is easy to show 
that this method gives ideal leakage matrices 1/ = I if the 
leakage matrices in equation ( |l9| ) were computed using 
B = F -1 / 2 . This means that the unwanted half of the 
window function averages to zero. A similar scheme was 
found to work well for disentangling three types of power 
in a galaxy redshift survey fl38| ]. 
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FIG. 9. Same as Figure |l| but after applying our disentangle- 
ment method. 
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FIG. 10. Curves showing leakage of B-power into estimates of E 
are plotted for B2001 before (solid) and after (dashed) applying out 
disentanglement method. In this plot, leakage has been computed 
by taking absolute values of the window functions — otherwise the 
dashed curve would be identically zero. For example, the value of 
the dashed curve at I = 20 can be interpreted simply as the ratio 
of shaded area in the two panels of Figure ^j. 

We test this method for our B2001 example, and a sam- 
ple disentangled window function is shown in Figure |^. 
Comparing this with Figure |l|, which targeted the same 
multiple, we see that the leakage (lower panel) has been 
substantially reduced and oscillates around zero. On very 
large angular scales £ ~ A£, this undesirable half of the 
window function remains substantial even though it by 
construction averages to zero. On small scales, however, 
the unwanted part of the window function is found to be 
consistently near zero, not merely on average. This is 
because both the desirable and the undesirable halves of 
the initial window function before disentanglement have 
essentially the same shape, so that our disentanglement 
process will cancel them out almost completely. To quan- 
tify how well the this process works, Figure |l^ shows 
leakage curves with the leakage matrix redefined with 
absolute values: 

( m „ 

L£ p , = £|W&,|, (20) 
l'=1 

(Without taking absolute values, the disentanglement 
method would by construction be diagnosed with zero 
leakage.) 

Although this method is likely to be adequate for prac- 
tical applications, it is possible to disentangle E and B 
still better if necessary, at the price of larger error bars. 
The Fisher matrix F generally becomes singular if the 
bands used are much narrower than A£. If the sky cover- 
age is large enough that A£ is narrower than features of 
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interest in the power spectra, is it convenient to choose 
bands of width around At instead of width one (solv- 
ing for each multipole separately). Since this produces 
an invertible Fisher matrix, perfect disentanglement is 
achievable by setting B = F _1 in equation (jig), giving 
only a modest increase in error bars and rather slight an- 
ticorrelations between neighboring bands. The resulting 
measurements can then be made uncorrelated separately 
for E and B, by multiplying by the inverse square roots 
of the corresponding two covariance matrices, thereby 
broadening the £- windows back to their natural widths. 



D. Measuring the cross power spectrum 
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FIG. 11. Window functions for measuring the cross power spec- 
trum are shown for the B2001 case, targeting I = 20 (top) and 
t = 70 (bottom). 

The three cross power spectra Cf E , Cf B and Cf B , 
which we will denote Cf, Cf and Cf for brevity, can 
be measured using the basic decorrelated quadratic esti- 
mators given by equation (|l^) without any modification. 
However, as we will now discuss, this is not necessarily 
the most desirable approach. 

To measure the cross power spectrum Cf , our data 
vector x must contain both unpolarizcd and polarized 
measurements as in equation (§. These may be either 
from a single experiment or from separate unpolarizcd 
and polarized ones. As reviewed in Appendix A, this 
gives a covariance matrix of the form 



/ (TT*) (TQ ( ) (TU*) 
(QT*) <QQ*) (QU ( ) 
> <UT*) (UQ f ) (UU*\ 



(21) 



which generically has non-vanishing elements in all the 
off-diagonal blocks. When we try to measure one of the 
parameters pi corresponding to the cross power spectrum 



Cf, the matrix = dC/dpi will vanish except in the 
T — Q and T —U cross-correlation blocks, since these are 
the only ones that depend on Cf . However, since C is 
a full matrix, the quadratic estimator Qi cx C^P^C -1 
(as well as decorrelated or disentangled variants thereof) 
will also be a full matrix, without any vanishing blocks. 
This means that the estimates qi of the cross power spec- 
trum will involve not only data combinations like TjQk 
and TjUk, but also terms like TjT^ and QjQk- In other 
words, the measured cross-correlation involves, among 
other things, the correlation of the temperature map with 
itself! The same peculiarity applies to the maximum- 
likelihood method, which is simply an iterated version of 
the quadratic method. 

We will now give a simple example illustrating why 
this happens, as well as argue that it is avoidable and 
sometimes undesirable. 



1. A complicated way to measure correlation 

As an illustrative toy model, let us temporarily assume 
that our data vector x consists of merely two numbers, 
the measurements of T and E for a given multipole (£, m) 
extracted from an all-sky map. Consider the simple case 
where Cf ~ Cf ~\ — the general case will follow from 
our result by a trivial scaling. This means that the data 
covariance matrix takes the form 



C = 



(Cf) (Cf) 
(Cf) (Cf) 




(22) 



where r = Cf /[CfCf] 1 ^ 2 is the dimensionless cor- 
relation coefficient between T and E. We have only 
three parameters to measure in our toy example, p = 
(Cf ,Cf ,Cf ), so the matrices P^ = dC/dpt are 



Pi = 




2 = 




3 = 



(23) 



Substituting this and equation (22) into equation (|T 
gives the Fisher matrix 



/I r! - r 

'22 ' 



(!_ r 2)2 



(24) 



-r 1 



with inverse 




(25) 



Substituting the above equations into equation ( |l5|) and 
using the method given by B = F _1 , the resulting unbi- 
ased estimators take the simple form 
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Qi 



Q 2 




Q 



3 — 7T 




(26) 



This is not surprising, since no other estimators could 
possibly be unbiased ((q) = p) when only one spher- 
ical harmonic mode is used. As soon as more modes 
are available, however, things can (and generally do) get 
more complicated. Consider, the case where x consists of 
four rather than two measurements for our multipole, — 
temperature measurements T\ and T 2 and .E-polarization 
measurements E\ and E 2 . The obvious guess would be 
that and estimator of Cf should involve only cross terms 
{TiEx, T X E 2 and T 2 E 2 ). However, terms such as Tf -T| 
and E\ — E\ will have a vanishing average, and can there- 
fore be added to any estimator without biasing it. As 
we saw above, precisely this generically happens in our 
minimum- variance estimator, since it helps reduce the es- 
timator variance if our four measurements have different 
noise variances. 

This can also be understood as follows. Repeating the 
derivation of equation ( |26] ) when there are n measure- 
ment of T and E for our mode, perhaps with different 
variances, shows that the zeroes in equation ( p6| ) will be 
replaced by n x n blocks that generally not vanish - 
merely their traces will. 



2. A simpler way 

Apart from being surprisingly complicated to compute 
and interpret, the basic quadratic estimators of equa- 
tion (|l0|) also have drawback related to systematic er- 
rors. If the estimated cross-power spectrum involves com- 
ponents of the unpolarized and polarized power spectra 
that are supposed to cancel out to reduce the variance, 
there is a risk that systematic errors from these two au- 
tocorrelations propagate into and contaminate the cross- 
correlation measurements. For this reason, an interest- 
ing alternative is to use an estimator for the cross-power 
spectrum that does not have this funny property, i.e., 
that contains only cross- terms. A simple way to con- 
struct such an estimator is to use equation ( |l5|) with the 
fiducial power spectrum Cf set equal to zero. This will 
make C and C _1 block diagonal, so the Q-matrices of 
equation (jl5|) acquire the same block structure as the P- 
matrices. The estimators of Cf therefore involve only 
cross terms. 
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FIG. 12. The low cost of simplicity. The curves show the rela- 
tive increase of the error bars on the four power spectra when the 
fiducial model is replaced by one with CjF = 0, thereby eliminating 
potential systematic errors as described in the text. 

It is noteworthy that this issue applies not only to esti- 
mation of Cf , but to measuring Cj , Cf and Cf as well. 
If the fiducial power spectrum has Cf ^ 0, then estima- 
tors of all four power spectra will involve using combi- 
nations of all three maps (T,Q,U), so setting Cf ^ 
when using equation (^) is an interesting simplifying op- 
tion for all power spectrum estimation. We implicitly did 
so in Section [II B by ignoring the T-map. 



3. Which method is better? 

The price we must pay for this simplification is a slight 
increase in error bars. This is quantified in Figure [l^ for 
the B2001 case. We computed the error bars on the four 
standard power power spectra for the unbiased method 
with both weighting schemes (assuming the true Cf for 
our concordance model and assuming Cf = 0)0. The 
figure shows the ratios (minus one), and illustrates that 
the simplification typically comes at a very low cost — 
an error bar increase of order a percent. In light of this 
and the potential peril of systematic errors, the simpler 
method appears preferable in practice. Returning to the 
most general case of estimating six joint power spectra, 



Specifi cally, we use broad i?-bins as described in Sec- 



II C 4 with A£ = 10 and use the method with B 



tion 

in equation ( |15[ ) to ensure that we are comparing apples with 
apples, i.e., to ensure that we are comparing error bars for 
measurements with identical window functions. 
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it is thus prudent to set all three cross power spectra to 
zero in the fiducial model: Cf = Cj = Cf = 0. 



E. Error bars 

Up until now, we have focused on the issue of window 
functions. Let us now turn to the complementary issue 
of error bars. A large number of papers, for instance 
[|],[|mniEi^ H i have made forecasts for how accurately 
upcoming polarization experiments can constrain cosmo- 
logical parameters. These estimates all assumed that 
the accuracy of the recovered polarized power spectra 
(Cj \Cf \Cf '\Cf) would be given by the 4x4 covari- 
ance matrix 



sky 



2t+l 



( T 2 X 2 T e X e \ 
X 2 Ef E/>Xg 
B 2 
\T £ X £ E t X t %[T t E t + Xf]/ 



(27) 



where / s k y is the fraction of the sky observed and 

P ( = C'[ + (w p )- 1 e e2 ^+ 1 \ P = T,E,B,X, (28) 

if the experimental beam is Gaussian with width 9 in ra- 
dians (the full- width-half-maximum is given by FWHM= 
\/81n2 9). Here the sensitivity measure l/w p is defined 



as 44 1 the noise variance per pixel times the pixel area 
in steradians for P = T,E, B. For the case of the EB 
cross power spectrum, w x — (w T w E ) 1 ^ 2 . Equation ( |2"g| ) 
can be interpreted as simply giving the total power from 
CMB (first term) and detector noise (second term). For 
our examples, we take w T — w E = w B = l/(FWHMcr) 2 , 
where FWHM and a are the resolution and noise values 
listed in Table 1. 

The approximation of equation ( p7| ) has been shown 
to be exact for the special case of complete sky coverage 
(/sky = 1) 0) and the same result follows from the Fisher 
information matrix formalism H. The factor (2£ + l)/ s ky 
can be interpreted as the effective number of uncorrelated 
modes per multipoleQ, and the other factor as giving the 
covariance per mode. 

The approximation that the number of uncorrelated 
modes scales as / s k y is both natural and well-motivated 
|j45|. How accurate is it in practice? Our calculations 



* When the sky coverage / s k y < 1, certain multipoles become 
correlated ^] . This reduces the effective number of uncorre- 
lated modes by a factor / a "^ , thereby increasing the sample 
variance on power measurements by the same factor pi| , ^4| . 
It also smears out sharp features in the power spectrum by 
an amount A£ comparable to the inverse size of the sky patch 
in its narrowest direction [^3) and mixes E and B power as 
discussed in the previous sections. 



enable us to address this issue quantitatively. Figure |i3| 
shows the ratio of the approximate error bars from equa- 
tion (^) to the exact error bars from equation (|l4|) for 
the four power spectra (the ratios of the square roots 
of the corresponding elements on the covariance ma- 
trix diagonals). To ensure a fair comparison, we used 
the uncorrelated method given by equation ( |l5| ) with 
B = F -1 / 2 — the B = F _1 and maximum-likelihood 
methods give larger anticorrelated error bars, whereas 
the B = I method gives smaller correlated ones|] It is 
seen that the approximation of equation (^?j) is generally 
quite accurate when I ^> A£ and £ <C £ max — A£, i.e., 
for multipoles well away from the two endpoints of the 
range computed. This means that forecasts made using 
the approximation of equation (|27j) should be viewed as 
quite accurate except on scales comparable to the survey 
size. 

For the case where all six power spectra are measured 
jointly, equation (^?j) is generalized so that the matrix in 
parenthesis gets replaced by the following one (suppress- 
ing the subscript £ for brevity): 



(29) 



, T 2 X 2 Y 2 
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XY 
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This can either be derived directly from a quadratic es- 
timator as in || or by computing the inverse Fisher in- 
formation matrix as in IpL 



§ The reader interested in implementing any of these meth- 
ods in practice should note that care needs to be taken when 
the bands used are much narrower than A£, since this makes 
F for all practical purposes singular, with many eigenvalues 
so close to zero that rounding errors make them slightly neg- 
ative. Our B2001 example with bandwidth 1 (we are measur- 
ing for each £ separately and A£ ~ 30) is a case in point. For 
the B = F _1//2 case, the window functions are simply pro- 
portional to the rows of F 1//2 , so they are readily computed 
by setting the offending eigenvalues to zero. This determines 
the normalization constants as Mi = 1/ 53 =i (^ 1//2 )»j 'j s i nco 
each window function must sum to unity. The error bar Aq; 
is then equal to Ni. In short, all plots in this paper remain 
well-defined even when F is singular. Evaluating q in practice, 
however, is not possible when F is singular, since it involves 
calculating F -1 ^ 2 numerically. If B is chosen to be a reg- 
ularized version of F -1 ^ 2 in equation (^|), the decorrelation 
method fails in the sense that the band power estimates qi will 
exhibit a slight residual correlation. In conclusion, the power 
spectrum should not be too oversampled for actual data anal- 
ysis. The same obviously applies to the B = F~ method. 
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FIG. 13. The curves show the ratio of the actual error bars to 
those computed with the approximation of equation ( ^7| ) . The ap- 
proximation is seen to be excellent for multipoles more than a cou- 
ple of bandwidths A£ away from the two endpoints of the Grange 
computed (outside shaded regions). 



IV. DISCUSSION 

We have presented a method for measuring CMB po- 
larization power spectra given incomplete sky coverage 
and tested it with a number of simulated examples. 



A. What have we learned? 

The issue of measuring the T power spectrum has 

been extensively treated in prior literature. An added 
challenge when measuring the E and B power spectra 
is leakage between the two caused by incomplete sky cov- 
erage. We quantified this leakage for the first time, and 
found that it is rather insensitive to experimental noise 
levels and depends mainly on geometry. Specifically, we 
found the leakage to depend mainly on the ratio £/A£, 
where A£ is the characteristic window function width 
and scales roughly as the inverse size of the sky patch 
in its narrowest direction. We introduced a disentangle- 
ment method which reduced the leakage to 5% — 10% for 
£ > A£, and described how it could be pushed to zero if 
necessary, at the cost of larger error bars. 

We found that when measuring the X power 
spectrum, the basic quadratic estimator produces a 
surprisingly complicated answer, involving not only 
temperature-polarization cross terms, but also, e.g., au- 
tocorrelations of the temperature map with itself. This is 
unfortunate, since it can make the measured power spec- 
trum more sensitive to systematic errors. Especially if 



the temperature and polarization maps are made by two 
different experiments, systematics should be uncorrelated 
between the two and therefore not contribute to the cross 
term average. The maximum-likelihood method exhibits 
the same problem. This can affect E- and B-polarization 
estimation as well, by giving non-zero weight to the unpo- 
larized map. The problem is eliminated by simply using 
vanishing cross-power in the fiducial model. We find that 
this is desirable in practice, since the resulting informa- 
tion loss causes error bars to increase only by negligle 
amounts, at the percent level. 

Finally, we found that on scales substantially smaller 
than the sky patch, the error bars for the i 7 ' _1 / 2 -method 
were accurately fit by the approximation of equation (|27 
where variance scales inversely with sky coverage. 



B. Relation to other methods 

The quadratic estimator (QE) method is closely re- 
lated to the maximum-likelihood (ML) method: the 
latter is simply the quadratic estimator method with 
B = F _1 in equation ([Hs]), iterated so that the fiducial 
("prior") power spectrum equals the measured one [^l| . 
The ML method has the advantage of not requiring any 
prior to be assumed. The QE method has the advantage 
of being faster (no iteration) and simpler to interpret — 
since it is quadratic rather than highly non-linear, the 
statistical properties the measured band power vector q 
can be computed analytically. This allows the likelihood 
function to be computed directly from q (a s op posed to 
x), in terms of generalized ^-distributions |4q| . 

Both methods are unbiased, but they may differ as 
regards error bars. The QE method can produce inaccu- 
rate error bars if the prior is inconsistent with the actual 
measurement. The ML method Fisher matrix can pro- 
duce inaccurate error bar estimates if the measured power 
spectra have substantial scatter due to noise or sample 
variance, in which case they are unlikely to describe the 
smoother true spectra. A good compromise is therefore 
to iterate the QE method once and choose the second 
prior to be a rather smooth model consistent with the 
original measurement. In addition, as mentioned above, 
we found that it is useful to set the X power spectrum 
to zero in the prior. 

The difference between the QE and ML methods is of- 
ten small in practice, which can be understood as follows. 
Since the QE method can be shown to be lossless if the 
prior equals the truth, thereby minimizing the error bars, 
small departures from the true prior merely destroy in- 
formation to second order. This is also why adopting a 
prior without X power inflated the error bars so little. 

The analysis of weak gravitational lensing data is 
rather analogous to that of CMB polarization, since the 
projected shear field can be decomposed into components 
corresponding to E and B. Recent lensing work cast in 
this language has included both theoretical predictions 
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for various effects [J47|48| and data analysis issues J49j . |50[] . 
In particular, no E/B leakage was detected at the 10% 
level when the ML method was applied to simulated ex- 
amples jl9), which can be understood from our present 
results since the band power bins used were substantially 
broader than A£ Efl] and mainly on scales i ^> A£. 



C. Outlook 



Our basic results are good news: although polarized 
power spectrum estimation adds several complications to 
the non-polarized case, they can all be dealt with using 
the techniques we have described. However, much the- 
oretical work remains to be done. Here are a couple of 
examples of areas deserving further study: 

• The effects of pixel shape and size in the mapmak- 
ing process needs to be quantified, and is likely to 
be more important for the polarized case. 

• Although the methods we have discussed apply 
equally well to measuring the power spectra of non- 
Gaussian signals (the only change is that the error 
bars will no longer be given by equation (jlj)), non- 
Gaussian signals contain more information than is 
contained in their power spectra. Polarized non- 
Gaussian fluctuations are expected from microwave 
foregrounds ]52|-|54|, secondary anisotropics fl55[ , 
topological defects |47| and CMB lensing Q , and 
only a small number of papers have so far addressed 
the issue of how to best measure such non-Gaussian 
signatures in practice Jsf] |58| . 

First and foremost, however, we need a detection of CMB 
polarization! 
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for this work was provided by NASA grant NAG5-9194, 
NSF grant AST00-71213 and the University of Pennsyl- 
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APPENDIX A: COMPUTING THE 
POLARIZATION COVARIANCE MATRIX 

This Appendix is intended for the reader who wishes to 
write software to explicitly compute the polarization co- 
variance matrix. The complete formalism for describing 
CMB polarization was presented in [0J|], and extended 
with a number of useful explicit formulas in |TlJ| . How- 
ever, a number of practical details are not covered in the 
literature, e.g., the rotation angles and degenerate cases, 
so we describe all steps explicitly for completeness. 

Let the 3-dimensional vector Xj denote the three mea- 
surable quantities for the i th pixel: 



(Al) 



The 3x3 covariance matrix between two such vectors at 
different points can be written 




(x.x*) = R(ay)M(ri • f J )R(a J - i ) t , 



(A2) 



Here M is the covariance using a (Q, J7)-convention 
where the reference direction is the great circle connect- 
ing the two points, and the rotation matrices given by 



'10 
R(a) = I cos 2a sin 2a 

. — sin 2a cos 2a 



(A3) 



accomplish a rotation into a global reference frame where 
the reference directions are meridians. The full (3n) x 
(3n) map covariance matri x is readily assembled out of 
the 3x3 blocks of equation ( A2) by looping over all pixel 
pairs. A powerful probe for bugs is making sure that this 
large matrix is positive definite. 



1. Computing the rotation angles 

As we will see, computing the magnitudes of the ro- 
tation angles ay is straightforward, whereas getting the 
correct sign is somewhat subtle. 

The great circle connecting the two pixels has the unit 
normal vector 



(A4) 



Similarly, the meridian passing though pixel i (its refer- 
ence circle for our global (Q, [/)-convention) has the unit 
normal vector 



z x n 



(A5) 



where z = (0, 0, 1) is the unit vector in the z-direction. 
The magnitude of ay, the rotation angle for pixel i, 



is simply the angle between these two great circles, so 
cos a^ = r~y ■ r * . The sign of ay is defined so that a 
positive angle corresponds to clockwise rotation at the 
pixel (at Ti). We therefore compute the cross product of 
the two circle normals, which has the property that 



r, sin ou 



(A6) 



(Since both r y and f* are by construction perpendicu- 
lar to fj, their cross product will be either aligned or 
anti- aligned with rV) Dotting equation (A6) with and 
performing some vector algebra gives 



sinoy = (fy X ?*) • fj CX [(ji XYj) X (?; X z)] ■ f i 



X?j) -z] fj OC ?i 



(A7) 



where the omitted proportionality constants are positive. 
atj therefore has the same sign as the z-coordinate of ry , 
and is given by 



cos 1 (ry • f *) if ry • z > 0, 



OHj = i " ■ ^ n (A8) 

cos 1 (ry • r*) if ry • z < 0. 



For generic pairs of directions, equation (AS) give s th e 



two rotation angles ay and aji needed for equation (|A2| ) 
However, it breaks down for the three special cases fj > 



Yj = 0, f i x z = and Tj x z = 0. If f j x f 3 ■ — 0, the two 



pixels are either identical or on diametrically opposite 
sides of the sky. Hence any great circle through Yi will 
go through ?2 as well. We can choose this circle to be 
the meridian, so no rotation is needed, i.e., ay = aji = 
for this case. Indeed, M comes out diagonal for this case 
by symmetry, with (QiUj) = and (QiQj) = (UiUj), so 
rotations have no effect. 

If Yi x z = 0, then the pixel is at the North or South 
pole, making our the global (Q, C/)-convention undefined. 
The simplest remedy to this problem is to move the pixel 
away from the pole by a tiny amount much smaller than 
the beam width of the experiment. 

The remainder of this section discusses some symmetry 
issue s th at the pragmatic reader may wish to skip. Equa- 
tion ( A8) guarantees a symmetric covariance matrix since 
swapping i and j is equivalent to transposing the result. 
Moreover, the two rotations R(ay) and R(ajj) are nearly 
equal when the two pixels are much closer to each other 
than to the poles (the two rotations would be identical for 
a flat sky), which can be seen as follows, r^ = — ry from 
the antisymmetry of the cross product. Since ?; i=s Yj 
implies that r* r*, and cos _1 (— x) = n — cos^ 1 (x), 



equation (AS) therefore gives a 



(7T-ay) 



The extra rotation by i t has no effect, since the rotation 
matrix of equation (A3) depends on twice the angle. 

There is a more subtle symmetry as well. Flipping the 
overall sign in equation ( A8) will give the same covariance 
matrix if we redefine U with the opposite sign convention. 
The sign convention of equation ( A8) corresponds to the 
standard definition of U. 
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2. Computing the matrix M 

The M-matrix depends only on the angular separation 
between the two pixels. It is given by 



( (TiT-j) {TiQj) (T t U 3 ) 
{TiQj) {Q l Q 3 ) (Q t U 3 ) 
{(TiUj) {UiQi) {UiUj) 



(T i T j )^J2( 2 ^P-)Pe(z)Cf, 



4tt 



(A9) 

(A10) 
(All) 
(A12) 



(QiQj) ^ E (^r 1 ) [Fl 2 W? ^ : )Cf • (A LSI 



(Qi^i) = E 



47T 

2^+1 

47T 



(A15) 



where z = f , -r^ is the cosine of the angle between the two 
pixels under consideration. The equations for (TiQj), 
(QiQj) and (UJJj) are from jll|] (beware a minus sign 
typo in the first one) and those for (TiUj) and (QiUj) 
are from ]59t| . P^ denotes a Legendre polynomial, and 
the functions F 10 , F 12 and F 22 are Q 



F w (z) = 2 

F 12 (z) = 2 
F 22 (z) = 4 



[(£-l)^+l)(£ + 2)] 1 /2 

i^+2)£p2 („\ ( t-4, I 1(1-1) "\ p2f^ 

{£-l)t{£ +!){£ + 2) 

{£ + 2)P 2 _ 1 (z) - (£ - \)zPj\z) 
(£ -!)£(£+ 1)(£ + 2)(1 - z 2 ) 



(A16) 



(A17) 



(A18) 



Here Pi and P 2 denote Legendre polynomials P™ for 
the cases m = and m = 2, respectively, which are 
conveniently computed using the recursion relations BO] 



1 



for I = 0, 



for*=l, (A19) 



(2f-l)^_ 1 (z)-(^-l)P f _ 2 (z) for ^ > 2 

3(1 -z 2 ) 



p/(z) = <; 5zp 2 2 



for £ = 2, 
for I = 3, ( A20 ) 
for f > 4. 



The d ivi sion by (1 — z 2 ) unfortunately causes expressions 
( A16Q — ( A1S| ) to blow up numerically when z = ±1, i.e., 




for correlations at zero separation or between diametri- 
cally opposite pixels in the sky Iplj . Taking the appro- 
priate limits for these cases gives 

(A21) 
(A22) 

i, 

if z = 1, 

(A23) 

±(-l) £ ifz = -l. ^ ^ 

3. Including variable angular resolution 

If the experimental beam is rotationally symmetric, 
its effect is straightforward to include even if the beam 
size and radial profile varies between pixels. This com- 
plication is particularly important for the case of cross- 
polarization, where it may be desirable to correlate po- 
larization maps from one experiment with a temperature 
map from another experiment that happens to have dif- 
ferent angular resolution. 

Let Bn denote the coefficients obtained from expand- 
ing the beam profile corresponding to the measurement 
Xi in Legendre polynomials. For a Gaussian beam, these 
coefficients are accurately described by the well-known 
approximation 



e 2 



(A24) 



where the rms beam width 9i is related to the full- width- 
half-maximum by 6i =FWHMi/V8 In 2. To compute the 
correlation (xiXj) in equation ( |A2| ), the six power sp ectra 
Cf are si mply replaced by Cf BuBjg in equations ( A1C ) 
through (|A15). 
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